ABSTRACT

The type II restriction endonuclease TseI recognizes
the DNA target sequence 5'-G^CWGC-3' (where
W = A or T) and cleaves after the first G to produce
fragments with three-base 5'-overhangs. We have
determined that it is a dimeric protein capable of 
cleaving not only its target sequence but also one 
containing A:A or T:T mismatches at the central 
base pair in the target sequence. The cleavage of 
targets containing these mismatches is as efficient 
as cleavage of the correct target sequence containing
a central A:T base pair. Cleavage does not appear to
involve base flipping, the mechanism found in some
other type II restriction endonucleases that recognize
similarly degenerate target sequences. The ability of
TseI to cleave targets
with mismatches means that it can cleave the 
unusual DNA hairpin structures containing A:A or 
T:T mismatches formed by the repetitive DNA se- 
quences associated with Huntington's disease 
(CAG repeats) and myotonic dystrophy type 1 (CTG 
repeats). 

INTRODUCTION 

Type II restriction endonucleases have found many uses in 
molecular biology because of their ability to cleave DNA 
molecules with extraordinary precision at specific se- 
quences of base pairs. Thousands of restriction enzymes 
have been discovered, and many are available commercially
(1). The restriction enzyme TseI, isolated originally
from a thermophilic bacterium of the genus Thermus,
displays optimal activity at ~65°C. TseI recognizes the
symmetric but ambiguous five-base pair sequence
5'-G^CWGC-3' in double-strand (ds) DNA (where
W = A or T) and cleaves after the first G to produce
fragments with three-base 5'-overhangs (1). ApeKI, a
similar enzyme of identical specificity from the archaeon
Aeropyrum, displays optimum activity at even higher
temperatures (2).

X-ray crystallography has revealed two distinct structural
classes of restriction enzymes that recognize
quasi-symmetric DNA sequences of this kind. Restriction
enzymes MvaI (specificity: CC^WGG) and BcnI
(CC^SGG where S = G or C) bind to DNA and are
active as small monomeric proteins (3-5). The proteins
contain only one catalytic site and accomplish ds 
cleavage by sequential nicking and hydrolysing first one 
DNA strand and then the other in separate binding events 
of opposite orientation (6,7). Each monomer interacts 
with all five base pairs of the recognition sequence. The 
interactions leading to recognition of the four defined base 
pairs are straightforward (3,4), but those leading to recog- 
nition of the central ambiguous base pair are less well 
understood, and thought perhaps to involve reversals in 
hydrogen-bond configurations (6). 

Restriction enzymes PspGI (^CCWGG) and Ecl18kI
(^CCNGG where N = any base), in contrast, bind to
DNA as homodimeric proteins (8,9). As each subunit
has a catalytic site, the homodimers have two and can
accomplish ds cleavage in a single binding event,
although for Ecl18kI this requires additional
interactions between neighbouring molecules at flanking
sites (10). Each subunit of PspGI and Ecl18kI interacts in a
conventional way with only half of the base pairs that
make up the recognition sequence, namely, the two
outer C:Gs that form each half-site. In both enzymes,
remarkably, the central bases are flipped from the helix into
pockets, and the gap left behind is compressed, effectively
reducing the recognition sequences to just CCGG (8,9).
Recognition of the central base pair is thought to take
place mainly within the pockets; these accommodate any
base in Ecl18kI, but in some way discriminate against G
or C in PspGI (11). When 2-aminopurine (2-AP) is
substituted for adenine at the central position in the
recognition sequence, binding by PspGI and Ecl18kI
produces a marked increase in fluorescence because of
base unstacking, a signature of base flipping (12,13).

PspGI and Ecl18kI are similar to each other in both
structure and amino acid (aa) sequence. They are
distinct from BcnI and MvaI, which themselves are also
similar in sequence and structure. The structure of TseI
has not been solved, but its aa sequence, and that of the
closely similar ApeKI, shows little similarity to those of
the other four enzymes, suggesting that TseI might belong
to a third structural class, distinct from either of the other
two. It is unclear, then, whether TseI binds to its target
sequence as a monomer or a homodimer, and whether the
central base pair remains intra-helical during recognition
or becomes flipped.

Given its recognition sequence, TseI will cleave CAG
and CTG trinucleotide repeats, which are involved in the 
aetiology of a number of neurodegenerative diseases, such 
as Huntington's disease (HD; CAG repeats) and myotonic 
dystrophy type 1 [CTG repeats; reviewed in (14,15)]. 
There is evidence that triplet repeats produce unusual 
DNA structures, such as triplexes, hairpins, slipped-strand 
DNA and G-quadruplexes (16). These structures affect the 
mobility of DNA in agarose gels (e.g. 17), and they have 
also been directly observed by electron microscopy 
(18,19). Recently, we used atomic force microscopy 
(AFM) to analyse DNA samples of various CAG repeat 
lengths (20). We found that the structural profile of the 
DNA changed significantly as repeat length increased. 
DNA from wild-type mice appeared as short linear mol- 
ecules, whereas when the CAG repeat length increased, 
various DNA structures, including convolutions, folds 
and protrusions, became apparent. Over 50% of DNA 
molecules containing 408 CAG repeats contained one of 
these unusual structures. We showed that the convoluted 
DNA was sensitive to mung bean nuclease, indicating that 
it contained hairpin mismatches. We then used TseI to
further characterize the structures observed in the
super-long CAG repeats. We found that at room temperature,
TseI preferentially cleaved linear regions of
858-repeat DNA, leaving behind the contorted regions.
In contrast, at 80°C, TseI completely digested the DNA.
These observations suggested that TseI preferentially
cleaves CAG repeats within normal B-form DNA, but
that at higher temperatures, it can also cleave DNA
containing central A:A or T:T mismatches, which
will be present in the hairpins. We speculated that like
PspGI, TseI might also flip out the central bases, and in
doing so, it might cleave such mismatched sequences by 
accommodating adenine in both pockets, in the case of 
A:A mismatches, or thymine in both pockets, in the case 
of T:T mismatches. In the current study, we set out to test 
this idea. 
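
The statement that the GCWGC recognition sequence occurs within CAG and CTG repeats can be checked with a short script. This is only an illustrative sketch: the function name `find_sites` and the 10-repeat tracts are assumptions made for the example, not part of the study.

```python
import re

# IUPAC code W means A or T. The lookahead lets overlapping sites be counted,
# because neighbouring sites in a repeat tract share their outer "GC".
TSEI_SITE = re.compile(r"(?=(GC[AT]GC))")


def find_sites(seq):
    """Return the 0-based start position of every GCWGC site in seq."""
    return [m.start() for m in TSEI_SITE.finditer(seq.upper())]


# Hypothetical 10-repeat tracts, chosen only for illustration.
cag_tract = "CAG" * 10
ctg_tract = "CTG" * 10

print(find_sites(cag_tract))
print(find_sites(ctg_tract))
```

In a tract of n repeats the site recurs every three bases, so both tracts contain n - 2 overlapping sites on each strand.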



MATERIALS AND METHODS 

TseI enzyme

The DNA and amino acid sequences for TseI are available
in GenBank under accession numbers JN035228 and
AEN19713, respectively. We thank David Hough of New England
Biolabs for the kind gift of purified protein. The protein
(378 aa; 5000 U/ml) had a concentration of 10.8 µM in
terms of TseI monomers based on a molecular weight of
41 780 Da and an extinction coefficient at 280 nm of 37 930
M⁻¹ cm⁻¹. ProtParam (21) at
http://web.expasy.org/cgi-bin/protparam/protparam was used to calculate these
values, assuming all cysteine residues were reduced and
the N-terminal methionine was unprocessed.
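
As a worked illustration of how the quoted molar concentration relates to mass concentration and absorbance, the sketch below applies the Beer-Lambert law to the values given above. The 1-cm path length and the function names are assumptions made for the example; the source does not state how the absorbance was measured.

```python
MW_DA = 41780.0          # monomer molecular weight from the text (Da = g/mol)
EPSILON_280 = 37930.0    # extinction coefficient from the text (M^-1 cm^-1)
STOCK_MOLAR = 10.8e-6    # monomer concentration from the text (mol/l)


def molar_to_mg_per_ml(conc_molar, mw_da):
    """mol/l x g/mol = g/l, which is numerically equal to mg/ml."""
    return conc_molar * mw_da


def absorbance(conc_molar, epsilon, path_cm=1.0):
    """Beer-Lambert law, A = epsilon * c * l (path length assumed to be 1 cm)."""
    return epsilon * conc_molar * path_cm


print(round(molar_to_mg_per_ml(STOCK_MOLAR, MW_DA), 3))   # about 0.451 mg/ml
print(round(absorbance(STOCK_MOLAR, EPSILON_280), 3))     # about 0.410
```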

Preparation of 2-aminopurine labelled oligonucleotides 

Apart from oligonucleotides containing 2-aminopurine 
(2-AP), all oligonucleotides were obtained from ATDbio 
(Southampton). Solid-phase synthesis of 2-AP-containing 
oligonucleotides was performed using phosphoramidite 
chemistry on a MerMade DNA/RNA oligonucleotide syn- 
thesiser (BioAutomation, USA) in a 5'-trityl (4,4'- 
dimethoxytrityl, DMT) group-on manner. Purification of 
the synthesised oligonucleotides was done using standard 
two-stage DMT-on/DMT-off reverse phase high- 
performance liquid chromatography (HPLC). The 
DMT-on full-length products possessed a prolonged reten- 
tion time during reverse phase HPLC purification and were 
easily separated from failed sequences. The reaction was 
detritylated with 40% acetic acid/water, and a further 
DMT-off reverse phase HPLC purification step was 
applied for higher purity. A NAP-10 column (GE
Healthcare) was used for desalting, and a Speedvac was 
used to concentrate the synthesized products. All 
synthesized oligonucleotides were examined by ESI- 
FTICR mass spectrometry to confirm an accurate molecu- 
lar mass. All the synthesized products were quantified using 
UV-Vis absorbance at 260 nm with the extinction coeffi- 
cient determined using Integrated DNA biophysics online 
software (http://biophysics.idtdna.com/UVSpectrum.html). 

Size-exclusion chromatography analysis for investigating
molecular mass of TseI in solution at different protein
concentrations

An analytical HPLC gel filtration column calibrated with
protein standards (apoferritin 443 kDa, β-amylase
200 kDa, alcohol dehydrogenase 150 kDa, bovine serum
albumin 66 kDa and carbonic anhydrase 29 kDa) was
used to determine the molecular weight of TseI in a
buffer composed of 20 mM Tris-HCl, 20 mM
2-(N-morpholino)ethanesulfonic acid (MES), 10 mM
magnesium chloride, 7 mM β-mercaptoethanol, 200 mM
sodium chloride and 0.1 mM ethylenediaminetetraacetic
acid (EDTA), pH 6.5, at room temperature as previously
described (22). The low pH is required for stability of the
silica-based chromatography material. TseI was
tested at concentrations from 4000 to 100 nM.

Measurement of DNA binding to TseI by fluorescence
anisotropy

The anisotropy duplex, labelled at the 5'-end of one strand
with hexachlorofluorescein (HEX), was used (Table 1).
Fluorescence anisotropy measurements were performed
using an Edinburgh Instruments FS900 photon counting
spectrofluorometer using a T-format measurement and
analysed as previously described (23). The excitation
wavelength was 535 nm, the emission wavelength was
555 nm and bandwidths were 5 nm. Eight hundred microlitres
of 10 nM anisotropy duplex was placed in a
10-mm path length quartz cuvette at 25°C. Small amounts
of TseI were added to the duplex solution using a microlitre
syringe to achieve enzyme concentrations from 5 to
320 nM and gently mixed by magnetic stirring. The buffer
was 50 mM potassium acetate, 20 mM Tris-acetate,
10 mM calcium acetate and 1 mM dithiothreitol, pH 7.9.

Steady-state measurements of 2-AP-substituted DNA and
DNA-TseI complexes

The 2-AP-labelled oligonucleotides were synthesized,
annealed to fully or partially complementary strands
(Table 1) and used for investigating the interaction of
TseI with DNA. All duplexes were buffered with 50 mM
potassium acetate, 20 mM Tris-acetate, 10 mM calcium
acetate and 1 mM dithiothreitol, pH 7.9. Samples of
200 µl containing 250 nM duplex plus 1500 nM
TseI were incubated for 20 min at 25°C. Steady-state
fluorescence spectra were measured using a FluoroMax
(Horiba Jobin Yvon) photon counting
spectrofluorometer. Spectra were recorded with a bandpass of
10 nm for both excitation (315 nm) and emission.

Fluorescence-based TseI activity assay

The substrate duplexes consisted of a 5'-HEX-labelled
oligonucleotide annealed to a complementary strand
labelled at the 3'-end with a black hole quencher 1
group (Table 1). The buffer was 50 mM potassium
acetate, 20 mM Tris-acetate, 10 mM magnesium acetate
and 1 mM dithiothreitol, pH 7.9. The duplexes are
essentially non-fluorescent in the ds-state and highly fluorescent
in the single-stranded state achieved by cleavage with TseI
at a temperature greater than the melting temperature of


Table 1. Oligonucleotides and duplexes used in this study

Duplex name                               Sequence

2-AP:T duplex 1                           5'-AGGAGTGAAGTCGC2GCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGTCGGGCACGAGTTC-5'
A:2-AP duplex 2                           5'-AGGAGTGAAGTCGCAGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCG2CGGGCACGAGTTC-5'
T:2-AP duplex 3                           5'-AGGAGTGAAGTCGCAGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTC2GCGTCGGGCACGAGTTC-5'
2-AP:A duplex 4                           5'-AGGAGTGAAGTCGC2GCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGACGGGCACGAGTTC-5'
2-AP:2-AP duplex 5                        5'-AGGAGTGAAGTCGC2GCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCG2CGGGCACGAGTTC-5'
Anisotropy duplex A:T                     5'-HEX-AGGAGTGAAGTCGCAGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGTCGGGCACGAGTTC-5'
Anisotropy duplex G:C                     5'-HEX-AGGAGTGAAGTCGCGGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGCCGGGCACGAGTTC-5'
Anisotropy duplex A:A                     5'-HEX-AGGAGTGAAGTCGCAGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGACGGGCACGAGTTC-5'
Anisotropy duplex T:T                     5'-HEX-AGGAGTGAAGTCGCTGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGTCGGGCACGAGTTC-5'
Fluorescence assay A:T duplex             5'-HEX-AGGAGTGAAGTCGCAGCCCGTGCTCAAG-3'
                                          3'-BHQ1-TCCTCACTTCAGCGTCGGGCACGAGTTC-5'
Fluorescence assay A:A mismatch duplex    5'-HEX-AGGAGTGAAGTCGCAGCCCGTGCTCAAG-3'
                                          3'-BHQ1-TCCTCACTTCAGCGACGGGCACGAGTTC-5'
Fluorescence assay T:T mismatch duplex    5'-HEX-AGGAGTGAAGTCGCTGCCCGTGCTCAAG-3'
                                          3'-BHQ1-TCCTCACTTCAGCGTCGGGCACGAGTTC-5'
Fluorescence assay product duplex         5'-HEX-AGGAGTGAAGTCG-3'
                                          3'-BHQ1-TCCTCACTTCAGCGTC-5'
A:T duplex                                5'-AGGAGTGAAGTCGCAGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGTCGGGCACGAGTTC-5'
A:A duplex                                5'-AGGAGTGAAGTCGCAGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGACGGGCACGAGTTC-5'
T:T duplex                                5'-AGGAGTGAAGTCGCTGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGTCGGGCACGAGTTC-5'
G:C duplex                                5'-AGGAGTGAAGTCGCGGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGCCGGGCACGAGTTC-5'
G:G duplex                                5'-AGGAGTGAAGTCGCGGCCCGTGCTCAAG-3'
                                          3'-TCCTCACTTCAGCGGCGGGCACGAGTTC-5'
13mer product sequence                    5'-AGGAGTGAAGTCG-3'
15mer product sequence                    5'-Phos-CAGCCCGTGCTCAAG-3'
12mer product sequence                    3'-GGGCACGAGTTC-5'
16mer product sequence                    3'-TCCTCACTTCAGCGTC-Phos-5'

The TseI recognition site, or its variant, occupies columns 13-17 of each
28-nt strand as written. HEX, hexachlorofluorescein group; BHQ1, black hole
quencher 1; Phos, phosphate group; 2-AP, 2-aminopurine analogue base (shown
as 2 in the sequences).

the cleavage products. The increase of the signal can easily
be quantified and initial reaction rates determined. All
samples were analysed with an Edinburgh Instruments
spectrofluorometer with excitation at 530 nm, emission
at 555 nm and 5 nm bandwidths. Each DNA sample
(100 µl) was placed in a quartz microcuvette with 10-mm
path length, put in the fluorometer and allowed to reach
the assay temperature of 60°C, whereupon a stable
fluorescence background signal was achieved. The fluorescence
intensity as a function of time was recorded immediately
after addition of TseI to a constant monomer concentration
of 32.4 nM. Photon counts were converted to the
amount of 5'-HEX-labelled oligonucleotide product
released using a calibration curve. The initial cutting rate
was plotted against substrate concentration and fitted
using the Michaelis-Menten equation, $v = V_{\max}[S]/(K_M + [S])$,
to determine the kinetic parameters $K_M$,
$V_{\max}$ and $k_{cat}$. To determine the rate of melting of the
products, a small aliquot of the fluorescence assay
product duplex (1.12 µM; Table 1) was diluted with
buffer at 60°C in the microcuvette and the fluorescence
signal recorded.
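
The fitting step can be illustrated with a minimal sketch. The source does not say which fitting procedure was used, so the example below substitutes a Hanes-Woolf linearisation (a straight-line fit of [S]/v against [S]) in place of a nonlinear fit. The substrate concentrations are hypothetical, and the rates are synthetic, noise-free values generated from the matched-duplex parameters reported in the Results.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate v for substrate concentration s (same units as km)."""
    return vmax * s / (km + s)


def hanes_woolf_fit(s_values, v_values):
    """Least-squares line through (S, S/v), where S/v = S/Vmax + Km/Vmax."""
    ys = [s / v for s, v in zip(s_values, v_values)]
    n = len(s_values)
    mean_x = sum(s_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in s_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(s_values, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope          # slope is 1/Vmax
    km = intercept * vmax       # intercept is Km/Vmax
    return vmax, km


# Hypothetical substrate concentrations (nM); rates are synthetic, not measured.
substrates = [10.0, 25.0, 50.0, 100.0, 200.0, 400.0]
rates = [michaelis_menten(s, 3.2, 42.2) for s in substrates]

fit_vmax, fit_km = hanes_woolf_fit(substrates, rates)
print(round(fit_vmax, 3), round(fit_km, 3))
```

With real, noisy data a weighted nonlinear fit is usually preferred, because linearised forms distort the error structure.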

Mass spectrometry 

For mass spectrometry analysis of oligonucleotides,
LC-MS was used, with detection performed in the
negative mode. Briefly, an Ultimate 3000 HPLC system
was used (Dionex Corporation, Sunnyvale, CA, USA),
equipped with an Aeris Widepore C18 reverse phase
analytical column, 50 × 2.1 mm (Phenomenex). Six-microlitre
samples containing ~5 µM of DNA duplex or
TseI-treated duplex were injected onto the column.
For chromatography, mobile phases A and B were
prepared comprising 98:2 water:ammonia and 49:49:2
water:MeOH:ammonia, respectively. Samples were
injected onto the analytical column, washed with mobile
phase A for 10 min, followed by a 20-min linear gradient
elution (200 µl/min) into 100% mobile phase B. MS data
were acquired on a Bruker 12 Tesla SolariXQe Fourier
Transform Ion Cyclotron Resonance (FT-ICR) instrument
(Bruker Daltonics, Billerica, MA, USA) equipped
with an electrospray ionization (ESI) source. Desolvated
ions were transmitted to a 6-cm Infinity cell Penning trap.
Trapped ions were excited (frequency chirp 48-500 kHz at
100 steps of 25 µs) and detected between m/z 600 and 2000
for 0.5 s to yield a broadband 512 Kword time-domain
data set. Fast Fourier Transforms and subsequent
analyses were performed using DataAnalysis (Bruker
Daltonics) software. Isotope distributions of specific
charge states were predicted from theoretical empirical
formulas. These were overlaid on the recorded
experimental data.
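
To illustrate which charge states of an oligonucleotide would fall inside the m/z 600-2000 detection window in negative mode, the sketch below assumes simple deprotonated ions, [M - zH] with charge z-. The 8700 Da neutral mass is a hypothetical value chosen for the example, not a mass reported in this study, and the function names are likewise illustrative.

```python
PROTON_MASS = 1.007276   # Da


def mz_negative(neutral_mass, z):
    """m/z of a deprotonated ion [M - zH]z-: (M - z * m_proton) / z."""
    return (neutral_mass - z * PROTON_MASS) / z


def charge_states_in_window(neutral_mass, mz_min=600.0, mz_max=2000.0, z_max=40):
    """Charge states whose m/z falls inside the detection window."""
    return [z for z in range(1, z_max + 1)
            if mz_min <= mz_negative(neutral_mass, z) <= mz_max]


hypothetical_mass = 8700.0   # Da, illustrative only
print(charge_states_in_window(hypothetical_mass))
```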

Polyacrylamide gel showing the cleavage of matched and
mismatched duplexes by TseI

Unlabelled duplexes (Table 1) were used in this experiment
at a concentration of 10 µM. In all, 2.5 µl of TseI stock
solution (10.8 µM) was added to give a final TseI concentration
of 0.26 µM. The buffer was 50 mM potassium
acetate, 20 mM Tris-acetate, 10 mM magnesium acetate
and 1 mM dithiothreitol, pH 7.9. All samples were
incubated at 65°C for ~6 h in the reaction buffer and
then 2.5 µl was run on a 15% polyacrylamide gel in 1× Tris/
Borate/EDTA buffer at 150 V, stained with SYBR Gold
and viewed under UV light. Single strands corresponding
to the reactants and products (Table 1) were run as
markers. The DNA marker ladder was O'RangeRuler
5 bp DNA Ladder (Thermo Scientific).

Denaturing HPLC analysis for TseI-matched and
-mismatched cutting

Unlabelled duplexes (Table 1) were used in this experiment
at a concentration of 10 µM. When cleavage was to be
tested, 2.5 µl of TseI stock solution (10.8 µM) was added
to 100 µl of the duplex solution to give a final TseI
concentration of 0.26 µM. All samples were incubated at 65°C
for ~12 h. The buffer was 50 mM potassium acetate,
20 mM Tris-acetate, 10 mM magnesium acetate and
1 mM dithiothreitol, pH 7.9. Twenty-microlitre samples
were injected onto the HPLC column. Injection
of TseI alone, in the absence of the duplex, gave no
HPLC signal above the baseline. A Gilson HPLC
equipped with absorption detection at 254 nm was used
for analysis of DNA cleavage. The reverse phase column
(C18, Jones Chromatography) was thermostatted at 65°C.
A linear gradient from 5 to 65% acetonitrile
was generated by mixing two aqueous solvents, one containing
0.1 M acetic acid with 5% acetonitrile and the other
0.1 M acetic acid with 65% acetonitrile.
The pH of the two solvents was adjusted to 6.5
with triethylamine.
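
The final enzyme concentration quoted above follows from a simple dilution calculation, sketched here with the volumes and stock concentration given in the text. The function name is illustrative.

```python
def final_concentration(stock_conc, stock_volume, sample_volume):
    """Concentration after adding stock_volume of stock to sample_volume."""
    return stock_conc * stock_volume / (stock_volume + sample_volume)


# 2.5 µl of 10.8 µM TseI stock added to 100 µl of duplex solution.
tsei_um = final_concentration(10.8, 2.5, 100.0)
print(round(tsei_um, 2))   # 0.26 (µM)
```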

RESULTS 

Characterization of TseI

A preparation of TseI was analysed by SDS-PAGE
(Figure 1a) and by size-exclusion chromatography
(Figure 1b and c). The SDS-PAGE gel showed a single
band after Coomassie blue staining at a molecular weight
of ~42 kDa, close to that expected from the amino acid
sequence. Further analysis with a calibrated analytical
size-exclusion chromatography column showed a single
elution peak at a constant elution time irrespective of
protein concentration in a range from 100 to 4000 nM.
The calibration indicated a molecular weight of
100 ± 18 kDa, suggesting that TseI is a homodimer in
the buffer used (Figure 1c).
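
The reasoning behind the homodimer assignment can be made explicit with a short check, using the monomer mass from the Methods and the measured mass and uncertainty quoted above. This is a sketch for illustration; the function names are assumptions.

```python
MONOMER_KDA = 41.78      # from the amino acid sequence (Methods)
MEASURED_KDA = 100.0     # size-exclusion estimate
ERROR_KDA = 18.0         # quoted uncertainty


def consistent_oligomers(measured, error, monomer, max_n=6):
    """Subunit counts whose expected mass lies within measured +/- error."""
    return [n for n in range(1, max_n + 1)
            if abs(n * monomer - measured) <= error]


print(round(MEASURED_KDA / MONOMER_KDA, 2))                          # 2.39 subunits
print(consistent_oligomers(MEASURED_KDA, ERROR_KDA, MONOMER_KDA))    # [2]
```

Only the dimer (83.56 kDa expected) falls inside the 82-118 kDa range; the monomer and trimer do not.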

Binding of TseI to HEX-labelled duplexes containing
the target sequence or A:A, T:T or G:C base pairs at the
middle of the sequence was examined using the increase in
fluorescence anisotropy of the HEX label caused by the
slowing of molecular rotation when the mass of the duplex
is increased by protein binding (23) (Figure 1d). Binding to
the target sequence or to the G:C duplex followed
a normal one-site ligand-binding equation until a
protein concentration of ~120 nM was reached. Above
this concentration, an additional binding event became
visible. Binding to the A:A and T:T mismatched
duplexes was weaker than to the properly base paired
duplexes, but it again followed a one-site binding
equation until a protein concentration of ~220 nM was
Figure 1. Analysis of TseI enzyme. (a) 4-12% gradient SDS-PAGE gel
stained with Coomassie blue. Lane 1 shows the molecular mass markers
(kDa); lanes 2-4 show TseI samples at 5.4, 2.7 and 1.35 µM,
respectively. (b) Size-exclusion chromatography, as monitored by fluorescence
emission at 350 nm with excitation at 295 nm, to investigate the solution
molecular mass. The elution profile (1000 nM shown) showed a single
peak at ~6.5 min corresponding to a molecular mass of 100 kDa. (c)



reached. Above this concentration, an additional binding
event became visible as the anisotropy increased further.
These additional binding events can only be due to an
increase in the amount of a species with even greater
molecular mass than the complex of one DNA duplex with
one enzyme molecule. We attribute this additional binding
event to non-specific binding of an extra copy of the
protein to the protein-DNA complex. In support of this,
we note that the final values of the anisotropy were similar
to those previously observed for HEX-labelled DNA
binding to M.EcoKI, an enzyme of molecular weight
170 kDa (23). Fitting the one-site binding equation to data
up to 120 nM protein concentration gave dissociation
constants of 176 ± 7, 149 ± 6, 279 ± 9 and 347 ± 30 nM
for the duplexes containing A:T, G:C, A:A and T:T base
pairs, respectively, at
the central position of the target. Thus, TseI binds less
well to the distorted duplexes containing the mismatched
base pair. It is of interest that the binding of TseI to the
G:C duplex is almost identical to its binding to the A:T
duplex, even though the former lacks the recognition
sequence for the enzyme. Thus, these data also
suggest that TseI must acquire its sequence specificity
after the binding event but before or concomitant with
the cleavage event.
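
To show what the fitted dissociation constants imply, the sketch below evaluates a one-site binding isotherm at the 120 nM fitting limit. It assumes the simplest form of the equation, in which free protein is approximated by total protein (depletion by the 10 nM duplex is ignored); the exact form used in the original analysis is described in reference (23) and may differ.

```python
# Dissociation constants (nM) reported above for each central base pair.
KD_NM = {"A:T": 176.0, "G:C": 149.0, "A:A": 279.0, "T:T": 347.0}


def fraction_bound(protein_nm, kd_nm):
    """One-site isotherm: fraction of duplex bound = [P] / (Kd + [P])."""
    return protein_nm / (kd_nm + protein_nm)


for name, kd in KD_NM.items():
    print(name, round(fraction_bound(120.0, kd), 2))
```

Under this assumption, less than half of each duplex is bound at 120 nM protein, and the mismatched duplexes are occupied least.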

Several restriction enzymes recognizing DNA target
sequences similar to that recognized by TseI have been
shown by crystallography and 2-AP fluorescence studies
to flip out the central base pair in the target sequence and
to collapse the DNA duplex, thus converting a five-base
pair recognition sequence into a four-base pair recognition
sequence. The increases in 2-AP emission intensity caused
by enzyme binding and base flipping ranged from 10- to
1000-fold for these enzymes. No structure is available for
TseI; therefore, the possibility that TseI uses base flipping
was investigated by replacing the central bases with 2-AP
either paired with T (2-AP:T) or mismatched with A
(2-AP:A) or with another 2-AP (2-AP:2-AP). As a
control, 2-AP was also placed outside of the target
sequence and paired with T. The fluorescence of the
duplexes containing 2-AP showed a typical 2-AP
emission spectrum when excited at 315 nm, with an emission
maximum at 370 nm (Figure 1e). The addition of excess
TseI to ensure near complete binding of the duplex only


Figure 1. Continued 

Dependence of molecular mass on the protein concentration
injected onto the column. (d) Fluorescence anisotropy increase for
10 nM hexachlorofluorescein-labelled 28-bp DNA duplex as a
function of TseI monomer concentration. Duplex with the target
sequence (open squares, bold solid line), duplex with an A:A
mismatch (open circles, solid line), duplex with G:C at centre of the
target (solid squares, bold dashed line) and duplex with T:T
mismatch (solid circles, thin dashed line). Lines are determined using
a one-site binding equation for data up to a TseI concentration of
120 nM. (e) Steady-state fluorescence emission spectra of 250 nM
2-aminopurine-substituted DNA in the absence/presence of 1.5 µM
TseI enzyme. The solid lines represent the fluorescence intensity of
DNA in the absence of TseI for duplexes 2-AP:T duplex 1 (spectrum
1), A:2-AP duplex 2 (spectrum 3) and T:2-AP duplex 3 (spectrum 5).
The dotted lines represent the fluorescence intensity of the same
duplexes in the presence of excess TseI (spectra 2, 4 and 6, respectively).
Figure 2. Fluorescence-based assay of TseI activity. (a) The 5'-HEX-
labelled top strand was annealed with 3'-Black Hole Quencher (BHQ)
1-labelled strand and became highly quenched. Adding TseI at elevated
temperature (60°C) results in cleavage of the duplex, separation of the
fluorophore-quencher pair and the appearance of fluorescence. (b)
Fluorescence intensity as a function of time. Initially, a low signal
was observed. After 30 s, the sample chamber was opened, and an


caused a small increase (up to ~50%) in the emission in- 
tensity without changing the emission maximum wave- 
length. This change in intensity was observed for all 
locations and base pairings of the 2-AP probe. We 
suspect that the modest intensity changes observed are 
due to minor distortion of the DNA by the enzyme and 
the placing of the 2-AP in a more hydrophilic environment 
rather than base flipping per se. However, it is also 
possible that the presence of an aromatic aa, such as tryp- 
tophan, in proximity to base-flipped 2-AP could quench 
the 2-AP fluorescence and disguise base flipping. In this 
case, its fluorescence would be similar in magnitude to that 
observed for the duplex with 2-AP located outside the 
recognition site. However, in the absence of further struc- 
tural information, we take these results to suggest that 
base flipping does not occur when TseI binds to its
DNA target. 

A continuous fluorescence-based assay demonstrates that 
TseI cleaves its target sequence even if it contains an A:A
or T:T mismatch 

The endonuclease activity of TseI on short DNA duplexes
containing the target sequence and variations thereof was 
investigated using a continuous fluorescence assay. The 
assay uses the difference in thermal stability of the 28 bp 
substrate and the shorter products, which will be single 
stranded at the assay temperature, to give a spectroscopic 
signal, Figure 2a. This assay was initially proposed by 
Waters et al. (23), who used the increase in absorption 
because of the melting of the short products, and was 
subsequently converted to fluorescence measurements by, 
for example, Li et al. (24), who used the melting of the 
short products to remove a fluorescence quencher from 
contact with a fluorescence HEX reporter. Provided that 
the assay temperature lies between the melting tempera- 
ture of the substrate and the products, the products melt 
on cleavage by the enzyme, and the fluorescence of the 
fluorophore is greatly enhanced. This assay works well
for TseI (Figure 2b), showing a substantial increase in
fluorescence from a low-background level as a function
of time after addition of the enzyme. The fluorescence
assay product duplex melts faster than the enzyme cleaves
the substrate, with melting complete after
~60 s (Figure 2c). Thus, the melting rate of the product does
not limit the assay.

The initial rate of fluorescence increase was determined 
as a function of substrate concentration for both a duplex 
containing the target sequence and for one containing an 
A:A or T:T mismatch in the middle of the sequence. All 
three duplexes were cleaved by the enzyme, and in fact, the 



Figure 2. Continued 

aliquot of TseI was added to 100 nM duplex (fluorescence assay A:T
duplex, Table 1). After closing the sample chamber at 50 s, TseI caused
a rapid increase in signal. (c) Time dependence of melting of the
fluorescence assay product duplex at 60°C in the absence of any TseI. (d)
Michaelis-Menten plots of TseI cleavage for matched DNA substrate (open
circles), A:A-mismatched DNA substrate (black circles) and
T:T-mismatched DNA substrate (open squares). Error bars are
standard deviations for experiments performed in triplicate.
mismatched duplexes were cleaved more quickly, as shown
in the Michaelis-Menten analysis (Figure 2d). The
maximum velocities for cleavage were 3.2 ± 0.2,
6.0 ± 0.8 and 5.4 ± 0.3 nM s⁻¹ for the normal duplex,
the A:A mismatched duplex and the T:T mismatched
duplex, respectively, giving $k_{cat}$ values of 0.20 ± 0.01,
0.37 ± 0.05 and 0.33 ± 0.02 s⁻¹, respectively, assuming
100% of the enzyme molecules were active as dimers.
The Michaelis constants, $K_M$, were 42.2 ± 9.3,
80.7 ± 30.6 and 95.2 ± 14.8 nM for the normal duplex,
the A:A mismatched duplex and the T:T mismatched
duplex, respectively. The values of $k_{cat}/K_M$ were,
therefore, (4.0 ± 0.95) × 10⁻³, (4.6 ± 3.4) × 10⁻³ and
(3.5 ± 0.6) × 10⁻³ nM⁻¹ s⁻¹; hence, the enzyme has no
preference for one substrate over the other, and the weaker
binding (higher dissociation constant) observed in the
anisotropy assay for
the mismatched sequences is mirrored in the poorer $K_M$
values.
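
The turnover numbers above follow from dividing each maximum velocity by the concentration of active enzyme. The sketch below reproduces that step, taking the 32.4 nM monomer concentration from the Methods and the stated assumption that all enzyme is active as dimers. The function name is illustrative.

```python
MONOMER_NM = 32.4   # TseI monomer concentration in the assay (Methods)


def k_cat(vmax_nm_per_s, monomer_nm=MONOMER_NM):
    """Turnover number (s^-1), assuming every enzyme molecule is an active dimer."""
    dimer_nm = monomer_nm / 2.0
    return vmax_nm_per_s / dimer_nm


# Maximum velocities (nM s^-1) reported above.
for name, vmax in [("A:T", 3.2), ("A:A", 6.0), ("T:T", 5.4)]:
    print(name, round(k_cat(vmax), 2))
```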

Mass spectrometry confirms that TseI cleaves mismatched
DNA duplexes at the same location as on normal DNA

FT-ICR mass spectrometry was used to identify the
nature of the cleaved DNA products. The uncleaved
duplexes after denaturation and separation by reverse
phase HPLC gave the expected molecular masses
(Table 2 and Supplementary Figures S1 and S2).
Treatment of the A:T, A:A and T:T duplexes with TseI
produced four shorter DNA strands as products. Each of
these products corresponded exactly to the mass expected
if cleavage occurred at the normal location for TseI
(Table 3 and Supplementary Figures S3 and S4).
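
The four expected products can be derived directly from the Table 1 sequences by cutting each strand after the first G of its GCWGC site. The sketch below does this for the A:T duplex; the function name `cleave` is an illustrative assumption, and the code models only the cut position, not the enzyme's behaviour.

```python
import re

SITE = re.compile(r"GC[AT]GC")   # W = A or T


def cleave(strand_5to3):
    """Cut a 5'->3' strand after the first G of its first GCWGC site."""
    match = SITE.search(strand_5to3)
    if match is None:
        return None              # no site on this strand
    cut = match.start() + 1
    return strand_5to3[:cut], strand_5to3[cut:]


# A:T duplex from Table 1; the bottom strand is listed 3'->5', so reverse it.
top = "AGGAGTGAAGTCGCAGCCCGTGCTCAAG"
bottom_3to5 = "TCCTCACTTCAGCGTCGGGCACGAGTTC"
bottom = bottom_3to5[::-1]

print(cleave(top))      # 13mer and 15mer
print(cleave(bottom))   # 12mer and 16mer
```

The fragment lengths match the 13mer, 15mer, 12mer and 16mer product sequences listed in Table 1, whereas the top strand of the G:C duplex contains no site and returns `None`.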

TseI cleaves target sequences containing A:A and T:T
mismatches but not sequences containing G:C or G:G at
the central position of the target sequence

The ability of TseI to cleave target sequences containing a
mismatch was unexpected and would indicate a new use
for TseI in investigating A:A and T:T mismatches
generated by the formation of hairpins in repetitive
DNA sequences such as (CAG)n found in many genetic
diseases. For this reason, the action of TseI on further
mismatched duplexes was explored.

Rather than perform full enzyme activity studies on
each possible mismatch using the fluorescence assay, we
performed gel assays (Figure 3) and HPLC assays
(Supplementary Figure S5) for cleavage of duplexes
containing various base pairs or mismatches in the central
position of the target sequence. Figure 3 shows that
28-bp duplexes containing mismatches of A or T were
all cleaved into shorter duplexes as effectively as the
normal cognate sequence, but that those duplexes
containing G or C at the central position were, as
expected, not cleaved, as they lacked the target sequence. Cleavage
was complete for the A:T, T:T and A:A duplexes as
would be predicted from the kinetic parameters
determined in the continuous fluorescence assay.

HPLC analysis of duplex substrates and products using 
reverse phase chromatography (run at high temperature to 
denature the DNA) was performed, Supplementary 
Figure S5. The substrate duplexes did not denature on 
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the column and eluted as a single peak for all substrates 
investigated, namely, A:T, A:A, T:T, G:C and G:G. The 
shorter product strands eluted as poorly resolved peaks at 
an elution time substantially lower than that of the 
uncleaved DNA strands; consequently, cleaved DNA 
could be clearly distinguished from uncleaved DNA. The 
28-base single strands eluted at the same time as the full- 
length duplexes or slightly later. The absorption elution 
profiles, Supplementary Figure S5, clearly demonstrated 
that duplexes containing the normal target sequence or 
AA or T:T mismatches were cleaved. The cleavage 
seemed to be incomplete for these three duplexes in 
contrast to the assay shown in Figure 3, where the A:T, 
A:A and T:T duplexes were completely cleaved. This may 
be due to the enzyme remaining bound to a fraction of the 
cleavage products and slowing their elution. Duplexes 
lacking the target site or containing G:G or C:C 
mismatches at the central base pair of the target 
sequence were not cleaved. Duplexes containing 2-AP 
instead of A were also cleaved when paired with T or 
mismatched with A or 2-AP (2-AP:T duplex 1, A:2-AP 
duplex 2, 2-AP:A duplex 4 and 2-AP:2-AP duplex 5 
from Table 1) (data not shown). 
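
The qualitative outcome of the gel and HPLC assays can be summarised as a simple lookup on the central position of the target sequence. The following Python sketch is illustrative only: the function name `central_pair_cleaved` is hypothetical, and the rule is a restatement of the results reported above, not a model of the enzyme. Combinations of an A or T with a G or C are returned as `None` because no result for them is stated here, and 2-AP substitutions are not covered.

```python
from typing import Optional

# Central-position bases for which cleavage was reported (A:T, A:A, T:T).
A_OR_T = {"A", "T"}
# Central-position bases for which no cleavage was reported (G:C, G:G, C:C).
G_OR_C = {"G", "C"}


def central_pair_cleaved(top: str, bottom: str) -> Optional[bool]:
    """Summarise the reported assay outcome for one central base pair.

    Returns True if cleavage was reported, False if no cleavage was
    reported, and None for combinations not covered by the text
    (an assumption of this sketch, not an experimental result).
    """
    top, bottom = top.upper(), bottom.upper()
    if top in A_OR_T and bottom in A_OR_T:
        return True
    if top in G_OR_C and bottom in G_OR_C:
        return False
    return None


if __name__ == "__main__":
    for pair in [("A", "T"), ("A", "A"), ("T", "T"), ("G", "C"), ("G", "G")]:
        print(f"{pair[0]}:{pair[1]} ->", central_pair_cleaved(*pair))
```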



DISCUSSION 

Data presented here show that Tsel cleaves not only its 
cognate sequence GCWGC but also sequences in which 
the central base pair is an A:A or T:T mismatch. In the 
apparent absence of base flipping of the central base pair, 
as observed for other restriction enzymes recognizing 
similar sequences (8,9,11-13), the mode of recognition of 
the central base pair must be rather unusual for discrim- 
ination of a base pair containing A and T from one con- 
taining G and C. Normally, discrimination of A:T/T:A 
from G:C/C:G would use minor groove interactions and 
not rely on DNA distortions, such as base flipping. 
However, an A:A or G:G mismatch will distort the 
DNA helix at the centre of the Tsel target sequence to 
approximately similar degrees; therefore, it is difficult to 
rationalize how this enzyme can recognize the central base 
pair in such a way as to disfavour G and C. Perhaps the 
duplex undergoes a degree of bending at the central base 
pair to provide the necessary discrimination. However, 
solving this problem will require X-ray crystallography, 
which would also reveal whether Tsel and the related ApeKI 
have a third type of fold distinct from those of the other 
restriction enzymes recognizing similarly degenerate targets (3-13). 

Figure 3. Polyacrylamide gel analysis of matched and mismatched DNA duplexes being cut by Tsel. Lane M was the molecular mass marker with 
10, 15, 20 and 30 bp DNA indicated. DNA cleavage of A:T, T:T and A:A duplexes by Tsel yields four 12-16mer single-strand oligonucleotides 
matching those shown in the products lane. 

The cleavage properties of Tsel allow it to digest not 
only CAG repeat sequences in normal duplex DNA but 
also the mismatched sequences caused by hairpin 
extrusion in long tracts of such repetitive DNA. In fact, we 
observed this dual action of Tsel recently using AFM 
imaging of the degradation of the highly convoluted 
DNA structures formed by annealing DNA strands 
containing long CAG repeats (20). The motivation for this 
previous study was to understand the change in DNA 
structure as CAG repeat length increased, a change that 
might be relevant to the aetiology of HD. In HD, the 
presence of an expanded CAG repeat in the coding 
region of HTT, the gene encoding Huntingtin, produces 
an expanded polyglutamine sequence in the expressed 
protein, thereby rendering it neurotoxic (14). The thresh- 
old CAG repeat length that causes HD is ~36, and there is 
an approximate correlation between the length of the 
CAG tract and the age of onset of the disease (25-27). 
Confounding this relationship is instability of repeat 
length, which occurs both in humans (28-32) and in 
mouse models of HD (33-37). In fact, super-long CAG 
expansions up to 1000 have been found in neurons from 
the brains of patients with adult-onset HD (31,35). 
Interestingly, it was shown recently that mice carrying 
the first exon of the human HD gene (R6/2 strain) with 
a super-long (~500) CAG repeat have a 
delayed onset of the disease phenotype and prolonged 
survival compared with mice with shorter repeats 
(38,39). There is a pronounced (>60%) reduction in 
both mRNA levels and expression in mice with more 
than ~335 CAG repeats as compared with mice with 
150 repeats (39), which might contribute to the observed 
phenotypic amelioration. We speculated that super-long 
DNA adopts unusual structures that progressively 
reduce the efficiency of transcription of the HTT gene, 
and we went on to observe the presence of these structures 
by AFM imaging (20). Significantly, Tsel preferentially 
cleaved normal duplex DNA and digested the mismatched 
DNA only at higher temperatures (80°C). In the present 
work, although Tsel showed no preference for digestion of 
base paired over mismatched substrates at 60°C, it did 
cleave mismatched substrates faster; hence, the tempera- 
ture dependence of Tsel action is different on the two 
substrates. By judicious choice of several incubation tem- 
peratures, therefore, it should be possible to use the 
enzyme to discriminate between the two forms of DNA. 
For this reason, we suggest that Tsel should be a useful 
tool, complementary to the use of the EcoP15I restriction 
enzyme (40), for exploring the role of DNA structure in 
triplet repeat expansion diseases, such as HD and 
myotonic dystrophy type 1. 
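
To illustrate why CAG and CTG repeat tracts are substrates, the sketch below scans a single strand, written 5' to 3', for the GCWGC target and reports where the cut after the first G would fall. It is a minimal illustration in Python under stated assumptions: the function name `tsel_cut_positions` is hypothetical, only the sequence of one strand is considered, and no account is taken of secondary structure, flanking-sequence effects or temperature.

```python
import re
from typing import List

# GCWGC with W = A or T; the lookahead lets overlapping sites be found,
# which matters in repeat tracts where adjacent sites share bases.
SITE = re.compile(r"(?=GC[AT]GC)")


def tsel_cut_positions(strand: str) -> List[int]:
    """Return 0-based indices of the base immediately 3' of each cut.

    The cut is placed after the first G of each GCWGC site, so a site
    starting at index i gives a cut between index i and index i + 1.
    """
    return [m.start() + 1 for m in SITE.finditer(strand.upper())]


if __name__ == "__main__":
    print(tsel_cut_positions("CAG" * 4))  # sites span adjacent repeats
    print(tsel_cut_positions("CTG" * 4))
```

In this representation, each site in a (CAG)n or (CTG)n tract spans the junction between neighbouring repeats, so the number of sites grows with the repeat length.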

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Figures 1-5. 